conditions were found to be linked to epigenetic mechanisms. An epigenome consists of all the epigenetic modifications on the genome of an organism that regulate the activity (expression) of its genes; these modifications can be passed down to the organism’s offspring [1]. Epigenomics is the study of the complete set of epigenetic modifications on the genome of an organism. Researchers use chromatin immunoprecipitation (ChIP) to examine the interactions between epigenetic components (proteins and DNA) and to profile DNA methylation in its original context. ChIP is used to identify, across the entire genome, the specific genes and sequences where a protein of interest binds, providing critical information about their regulatory functions and mechanisms.

The laboratory protocol for ChIP begins with fixing chromatin-bound proteins in vivo with formaldehyde to stabilize the proteins on the chromatin. Sonication or restriction enzymes are then used to cut the chromatin into short random fragments, usually around 200 bp. Alternatively, ChIP can be performed without formaldehyde crosslinking by digesting the chromatin with micrococcal nuclease, an enzyme that breaks chromatin into fragments of the desired size (native ChIP). Antibodies specific for the proteins of interest are added to the short chromatin fragments. These antibodies form immunoprecipitated DNA–protein–antibody complexes that can be separated, using beads, from the non-immunoprecipitated chromatin (DNA without the protein of interest). The formaldehyde cross-links are then reversed either by heating or by digesting the protein component of the chromatin. In the final step, only the DNA fragments to which the proteins of interest were bound are isolated and purified.

At this point, we have the DNA fragments that were the targets of the epigenetic modification. The next step is to characterize these fragments by identifying the sequences of the binding sites and the affected genes. This provides important information about the binding sites of transcription factors (TFs), the function and regulation of the genes, and the impact of gene activity on the condition studied.
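As a rough illustration of linking binding sites to affected genes, the following sketch assigns each binding site to the gene whose transcription start site (TSS) lies closest to the midpoint of the site. The gene table, the coordinates, and the function names `build_tss_index` and `nearest_gene` are hypothetical and chosen only for this example; assigning by nearest TSS is one simple heuristic under these assumptions, not a method prescribed by the protocol above.

```python
from bisect import bisect_left

# Hypothetical gene table: (gene name, chromosome, transcription start site).
GENES = [
    ("geneA", "chr1", 1_000),
    ("geneB", "chr1", 5_000),
    ("geneC", "chr1", 12_000),
    ("geneD", "chr2", 3_000),
]


def build_tss_index(genes):
    """Group TSS positions by chromosome and sort them for binary search."""
    index = {}
    for name, chrom, tss in genes:
        index.setdefault(chrom, []).append((tss, name))
    for chrom in index:
        index[chrom].sort()
    return index


def nearest_gene(index, chrom, start, end):
    """Return (gene, distance) for the TSS closest to the site midpoint.

    Returns None when the chromosome has no annotated genes.
    """
    entries = index.get(chrom)
    if not entries:
        return None
    mid = (start + end) // 2
    positions = [tss for tss, _ in entries]
    i = bisect_left(positions, mid)
    # Only the TSS just before and just after the midpoint can be nearest.
    candidates = entries[max(i - 1, 0): i + 1]
    tss, name = min(candidates, key=lambda entry: abs(entry[0] - mid))
    return name, abs(tss - mid)


index = build_tss_index(GENES)
# Hypothetical binding site on chr1 spanning positions 4700-4900.
print(nearest_gene(index, "chr1", 4_700, 4_900))
```

In practice the gene coordinates would come from a genome annotation file rather than a hard-coded list, and a distance cutoff is often applied so that distant sites are not assigned to any gene.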

6.2  CHIP SEQUENCING

Researchers use different techniques to characterize the ChIP-purified DNA fragments, including Southern blot, polymerase chain reaction (PCR), and microarray. More recently, sequencing these fragments with high-throughput methods has become the most commonly used and effective approach for studying epigenetic modifications. The technique of sequencing the DNA fragments isolated by immunoprecipitation is called ChIP-Seq.

Highly specific antibodies against the targeted proteins are used for ChIP-Seq so that only the DNA fragments affected by the epigenetic modifications are isolated. ChIP-Seq also requires control DNA from the same samples: either fragments from regions that were not affected by the epigenetic modification (non-immunoprecipitated chromatin fragments) or input DNA purified from the fragmented chromatin before the antibody incubation step. The control DNA serves as a baseline to normalize the ChIP data. Normalization against the control data can reduce false positives originating from biases that cause overrepresentation of reads. Possible sources of bias include nonuniform fragmentation during sonication (sonication bias), PCR amplification, which tends to over-amplify GC-rich regions (PCR bias), sequencing bias, and mapping bias.
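To make the role of the control concrete, the sketch below computes a per-bin fold enrichment of ChIP reads over control reads. It assumes that reads have already been counted in fixed genomic bins, scales the control library to the sequencing depth of the ChIP library, and adds a pseudocount to avoid division by zero. The function name `fold_enrichment`, the pseudocount value, and the read counts are illustrative assumptions; dedicated peak callers use more elaborate statistical models.

```python
def fold_enrichment(chip_counts, control_counts, pseudocount=1.0):
    """Per-bin ChIP/control ratio after scaling both libraries to equal depth."""
    if len(chip_counts) != len(control_counts):
        raise ValueError("ChIP and control must cover the same bins")
    chip_total = sum(chip_counts)
    control_total = sum(control_counts)
    if chip_total == 0 or control_total == 0:
        raise ValueError("both libraries need at least one read")
    # Scale the control so its total read count matches the ChIP library.
    scale = chip_total / control_total
    return [
        (chip + pseudocount) / (ctrl * scale + pseudocount)
        for chip, ctrl in zip(chip_counts, control_counts)
    ]


# Hypothetical read counts in five bins; the control is sequenced twice as deep.
chip = [10, 10, 90, 10, 40]
control = [40, 40, 40, 40, 160]

for bin_id, ratio in enumerate(fold_enrichment(chip, control)):
    print(bin_id, round(ratio, 2))
```

In this toy data the third bin stands out as enriched, whereas the last bin has many ChIP reads but a matching excess in the control, so its ratio stays near background. That is the kind of overrepresented region that would otherwise be reported as a false positive.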